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Abstract 

We propose a single chain slip-spring model, which is based on the slip-spring model by 
Likhtman [A. E. Likhtman, Macromolecules, 38, 6128 (2005)], for fast rheology simulations of 
entangled polymers on a GPU. We modify the original slip-spring model slightly for efficient 
calculations on a GPU. Our model is designed to satisfy the detailed balance condition, which 
enables us to analyze its static or linear response properties easily. We theoretically analyze 
several statistical properties of the model, such as the linear response, which will be useful to 
analyze simulation data. We show that our model can reproduce several rheological properties 
such as the linear viscoelasticity or the viscosity growth qualitatively. We also show that the 
use of a GPU can improve the performance drastically. 

Keywords: Slip-Spring, Entangled Polymer, Linear Response Theory, GPU 

1 Introduction 

Polymeric liquids exhibit various interesting macroscopic flows. To simulate macroscopic flow 
behaviors of entangled polymeric materials, we need to incorporate microscopic or mesoscopic 
polymer models (which reproduces required rheological properties) with the macroscopic fluid 
model (such as the Cauchy equation). Considering the computational costs, phcnomcnological 
constitutive equation models are reasonable for macroscopic simulations. However, most of 
constitutive equation models involve rather rough or physically unclear approximations, which is 
not fully justified. To avoid such uncertainties, we can use microscopic or mesoscopic molecular 
models [5Ul2|. Although these molecular models also involve approximations, generally they require 
less phenomenological parameters, and we expect they are more precise than constitutive equation 
models. Thus we expect the use of molecular models instead of constitutive equation models can 
improve the macroscopic flow simulation. 

Several different methods have been proposed to incorporate the mesoscopic polymer mod- 
els with macroscopic fluid models. For example, the CONNFFESSIT type methods [13], the 
Lagrangian particle based methods [14, 15 , or recently developed finite volume based hybrid 
method 16, 17J were proposed and achieved success to simulate macroscopic flows of polymers. 
Because these models directly combine different models which have different time and length 
scales (microscopic or mesoscopic rheological model, and macroscopic fluid model), we call them 
"multiscalc" simulation models in this work. In these multiscale models, microscopic or mesoscopic 
rheological simulations arc directly used to calculate macroscopic quantity such as the stress ten- 
sor, instead of phenomenological constitutive equation models. In other words, microscopic or 
mesoscopic simulations are embedded to macroscopic fluid elements. Clearly such simulations re- 
quire many microscopic or mesoscopic rheological simulations which are numerically not efficient. 
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To simulate large scale and/or long time macroscopic phenomena, therefore it is demanding to 
perform many microscopic or mesoscopic simulations efficiently. 

One possible way is to use hardwares for acceleration. So far, several acceleration hardwares 
(such as MDGRAPE p3H2Q], ClearSpeed [SUES], or Cell Broadband Engine (23H2S]) have been 
developed and utilized to accelerate calculations. Recently, the acceleration by a graphic processor 
unit (GPU) [26 30 , which is called general purpose GPU (GPGPU) programming [31], is utilized 
to accelerate various calculations including molecular dynamics [321133] , fluid dynamics [34] , Monte 
Carlo models 35 , or stochastic differential equation models 36 . It is reported that GPUs can 
accelerate simulations drastically, although the efficiency strongly depends on a target. Because 
GPUs are much cheaper than other acceleration hardwares and new GPUs are developed continu- 
ously, they are considered to be promising compared with other previous acceleration hardwares. 
Thus we expect that rheological simulations on a GPU will be a good candidate for the embedded 
fast rheological simulation. 

In this work we employ the slip-spring model, which is originally proposed by Likhtman [10) to 
study the structure and dynamics of entangled polymers, for simulations on a GPU. We propose a 
slightly modified, single chain version of the slip-spring model. We design our model to be suitable 
for simulations on a GPU. We also design our model to fully satisfy the detailed balance condition. 
We derive several statistical properties such as equilibrium probability distributions or a formula 
for the relaxation modulus. By performing rheological simulations both on a CPU and on a GPU, 
we study rheological properties of the model and the acceleration effect by a GPU. It is shown that 
our model reproduces rheological properties reasonably and it enables efficient calculations on a 
GPU. 

2 Model 

Although there are various mesoscopic molecular models to calculate rheological properties of 
entangled polymers [SHU], not all models are suitable for simulations on a GPU. In this work, we 
employ the slip-spring model proposed by Likhtman [10 . In the Likhtman's original slip-spring 
model, polymer chains are expressed as ideal non-interacting Rouse chains and the entanglement 
effect is mimicked by slip-springs. One end of a slip-spring is fixed in space and another end is 
attached to the polymer chain. As we will show later, such a model is suitable for calculations on 
a GPU. 

In this section, we show a single chain slip-spring model, which is a variant of the original 
slip-spring model. We modify the Likhtman's model slightly to make it suitable for simulations on 
a GPU. To implement simulation programs on a GPU, we need to use GPU specific programming 
environment and it has several limitations. Thus we design our dynamics model to be as simple 
as possible. At the same time, to make the model physically natural and to make static properties 
simple, we attempt to make the dynamics model to satisfy the detailed balance condition. We also 
discuss about the equilibrium statistical properties and the linear response theory of the slip-spring 
model, which is useful to calculate the linear viscoelasticity. 

2.1 Free Energy, Grand Potential, and Equilibrium Statistics 

We first describe the model of our single chain slip-spring model. We model an entangled polymer 
chain by an ideal Rouse chain and slip-springs. The conformation of a polymer chain is expressed 
by positions of beads {Ri}, where i — 1,2, ...,N is the bead index (N is the total number of 
beads). The conformation of slip-springs are expressed by two set of variables; the bead indices of 
slipl-spring {Sj} and the anchoring point positions {Aj}, where j = 1, 2, . . . , Z with Z > being 
the total number of slip-springs. One end of the j-th slip-spring is attached to the Sj-th bead of 
a polymer chain, and another end is anchored in space, at Aj. We assume that Sj is an integer 
value and 1 < Sj < N. The total number of slip-springs Z is not constant because slip-springs can 
be spontaneously constructed or destructed. Thus variables {Ri}, {Aj}, {Sj}, and Z are required 
to specify a state uniquely. 
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To make the statistical mechanical properties of the model clearly, we first describe the free 
energy of a single chain with slip-springs. The free energy of a polymer chain and attached slip- 
springs can be expressed as follows. 
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A 3 ) 



(1) 



where fcs is the Boltzmann constant, T is the temperature and b is the bead size (segment size). N s 
is a parameter which represents the strength of the slip-springs. As mentioned, the total number of 
slip-springs, Z, is not constant but fluctuates in time since slip-springs are dynamically constructed 
or destructed. To handle such a variable, it is convenient to use grand canonical type ensemble 
for slip-springs, which is originally introduced by Schieber |37| for the slip-link model. The grand 
potential of the system can be expressed as follows. 



J{{Ri}, {A,}, {Sj}, Z) = T{{Ri\, {Aj}, {Sj}, Z) - vZ 



(2) 



where v is the effective chemical potential for a slip-spring. As long as the detailed balance condition 
is satisfied, all the equilibrium statistical properties can be calculated from the grand potential © . 
As we discuss later, we can design dynamics to satisfy the detailed balance condition, and thus all 
the equilibrium properties shown in the followings can be also applied to our dynamics model. 

Before performing numerical simulations, we analytically show several equilibrium statistical 
properties of our single chain slip-spring model. In this section, we show equilibrium and linear- 
response properties. Here we may emphasize that the results in this section is based on the usual 
equilibrium statistical physics and the linear response theory in non-equilibrium statistical physics. 
As long as the detailed balance condition holds, all the results in this section also hold. 

By using the grand potential ([2]), we can calculate the grand partition function of the system. 
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where V is the system volume. A and A s are the thermal de Brogle wave lengths for a bead and 
a slip-spring, respectively. (They are required to make the grand partition function dimensionless. 
As we will show, thermodynamic properties are not affected by them.) The factorial Z\ is the 
Gibbs factor due to the indistinguishably of slip-springs. The equilibrium distribution of a state is 
given as the following usual Boltzmann form. 
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Here the equilibrium distribution function is normalized to satisfy the following normalization 
condition. 

]T f d{R t }, d{A 3 } P eq ({i?J, {Aj}, {Sj}, Z) = 1 (5) 
{s 3 },z 

The equilibrium average or distribution of a given set of variables can be calculated from eq 
((3]). The equilibrium distribution of the chain conformation (the bead positions {Ri}), P cq ({Ri}) , 
becomes 
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Eq ((6|) is the same as the equilibrium conformation of an ideal chain. Thus we find that in equi- 
librium our model can be reduced to a single ideal chain. Therefore all the equilibrium statistical 
properties of a chain (such as the average end to end vector or the average radius of gyration) are 
just the same as ones of an ideal chain. This is consistent with the fact that the entanglement 
effect is just a dynamic effect and does not affect static properties. 

Similarly, we can calculate equilibrium statistical properties of slip-springs. The equilibrium 
distribution of the number of slip-springs, P cq (Z), is given as follows. 



P cq (Z) = E / d{R t }d{A,}P cq ({R t },{A 3 },{S 3 },Z) 
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where we defined the modified effective chemical potential v as 
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From eq ([7]) we find that the equilibrium distribution of Z is expressed as a Poisson distribution. 
This is consistent with the distribution function in the slip-link model by Schieber |37j . The average 
number of slip-springs (Z) cq ((. . .) oq means the equilibrium statistical average) can be related to 
the modified effective chemical potential v. 



J2 ZP cq (Z) = 
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We expect that the average number of slip-springs can be expressed by using the characteristic 
number of beads between slip-springs, Nq. 
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Here we note that generally N in eq (fTTTj) does not coincide with the number of beads between 
entanglements calculated from the plateau modulus, N e . (Typically No is smaller than N e [TU1138] .) 
From eqs (|9]) and (fit)]) , we have the following relation between N and v. 
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Eq ([7|) can be then rewritten simply as 
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By using eqs ([6|) and (fT2|). we can rewrite eq (|4]) as follows. 

z 

P e<l ({Ri}, {A,}, {Sj}, Z) - P cq ({R})P cq (Z) 11 P eq (^)Pe q (^|{i^}, Sj) 

i=i 

Pcq(Sj) 



N 



Peq{Aj\{Ri}, Sj) 



2irN,b 2 



3/2 



exp 



2N,b 2 



(R Sj 



(11) 
(12) 

(13) 

(14) 
(15) 



Here, the notation P eq (X\Y) represents the conditional probability of X for given Y. Eq (Ti"4"|) 
is the equilibrium distribution of a bead index of a slip-spring. Eq (|14[) can be interpreted as 
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the equilibrium distribution of an ideal gas particle on a one dimensional lattice which has N 
lattice points. Eq (|15j) is the equilibrium distribution of an anchoring point under a given chain 
conformation and a bead index. 

To study rheological properties, we need the microscopic expression for the stress tensor. From 
the stress-optical rule, the stress tensor of a single chain can be expressed by using the bead 
positions. 

N-l „, rp 

tr({Ri}) = - - Ri) ~ Nk B Tl (16) 

i=l 

The second term in the right hand side of eq P^)l does not have the non-diagonal elements and thus 
it can be dropped when we consider the shear stress or the normal stress difference. In equilibrium, 
the stress tensor of the system can be calculated to be 

c <o-({i^})) eq = -cok B Tl (17) 

where cq is the average density of polymer chains, (cq can be related to the average bead density 
po as co = po/N). From eq (|17|) we find that in equilibrium, our single chain slip-spring model 
gives the stress tensor of an ideal gas system with the number density cq. This is physically natural 
because the entanglement effect is purely kinetic, and the system is the same as a non-interacting 
Rouse chain system in equilibrium. 



2.2 Dynamics 

We cannot study dynamical properties only from the grand potential ((2]). In this subsection, we 
introduce a simple but physically valid dynamics model for a single chain slip-spring model. As 
we mentioned, the purpose of this work is to design a simulation model which is suitable for the 
calculations on a GPU. In this work, we will employ the CUDA programming model 26-28 which 
is developed and provided by NVIDIA corporation and widely used for GPGPU calculations. 

Although the CUDA programming model provides us a fast and efficient GPGPU environment, 
there are several (rather strict) limitations. Before we design the dynamics model, here we briefly 
review some restrictions in CUDA which are directly related to the design of our dynamics model. 
First, parallel threads on a GPU are segmented into several blocks (each block contains typically 
from several tens to several hundreds of threads) . The data communication between different blocks 
are much slower compared with the communication inside a block. Thus the data communication 
between blocks should be reduced to achieve high performance. Second, parallel threads are 
basically designed to perform the same task (with different data values) and the complicated 
conditional branches decrease the performance. Third, the amount of registers and shared memory 
is not large (the total registers and shared memory are 8192 (32kB) and 16kB per one block, 
respectively). Fourth, some arithmetic operations are not implemented, or not efficient compared 
with a CPU. (Although these restrictions depend on the GPU architecture, there are qualitatively 
similar restrictions for other GPGPU calculation models.) The dynamics model shown in this 
subsection is designed achieve high performance calculations on a GPU under these limitations 
(see Section EO)) . 

Although we introduce a specific dynamics model in this subsection, equilibrium properties 
(shown in the previous subsection) are not altered for other dynamics models as long as the 
detailed balance condition is satisfied. 

We start from the dynamics of beads. In the absence of slip-springs (Z — 0), we expect that 
the dynamics reduces to the Rouse dynamics. Then the dynamics of beads can be modelled as the 
simple Rouse type dynamics. We use an overdamped Langevin equation as the dynamic equation 
for 0j bend 

jggg _ {y ■ «,>. «) + K(t) . a, + ftW (18) 

at c, otti 

where £ is the friction coefficient for a bead, n(t) is the velocity gradient tensor, and is the 
Gaussian white noise. £i(t) satisfies the following fluctuation-dissipation relation which guarantees 
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the detailed balance. 
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where (...) means the statistical average and 1 is the unit tensor. 

Dynamics of slip-springs is not trivial. Because we are seeking a model which is suitable for 
simulations on a GPU, here we employ rather simple dynamics for slip-springs. We assume that 
the anchoring point of a slip-spring moves only by the external flow. Thus the dynamic equation 
for Aj becomes the following simple advection equation. 



dAj{t) 
dt 



n(t) ■ Aj 



(21) 



In the absence of the velocity gradient, the anchoring points do not change their positions unless 
they are reconstructed. For the dynamics of slip-spring bead indices, we employ stochastic jump 
processes [SJI3S]. For simplicity, we assume that a slip-spring bead index can move only to its 
neighboring bead indices by a single jump. The jump probability from the bead index Sj to the 
bead index Sj can be expressed as a transition matrix W(S'j\Sj). 




(S'j=S j + l) 
(S'j = Sj - 1) 
Ws-iSj) (S'j=Sj) 
(otherwise) 



(22) 



where Ws+(Sj) and Ws-(Sj) are transition probabilities which increment or decrement the bead 
index, respectively. These transition probabilities should satisfy the detailed balance condition. 
There are many possible forms for the transition probabilities which satisfies the detailed balance 



condition. In this work we employ the following Glauber type dynamics |40j 
that the Glauber dynamics satisfies the detailed balance condition.) 
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Here £ s is a parameter which represents the effective friction coefficient for a slip-spring. In con- 
trast to the Likhtman's model, we allow slip-springs to pass through each other. However, as 
mentioned by Likhtman, this does not affect physical properties qualitatively. From the view point 
of numerical calculations, this enables numerical schemes to be simple and thus we can make the 
implementation on a GPU simple. 

Finally we model the slip-spring reconstruction dynamics. Slip-springs on chain ends (Sj = 
1,N) can be removed from a chain and destructed. To compensate the destruction process, we 
have to introduce the slip-spring construction process on chain ends. These reconstruction events 
can be modelled as the jump processes, just like the dynamics for {Sj}. If we assume that just one 
slip-spring can be destructed or constructed at one reconstruction event, the jump probabilities 
become as follows. 



W Z (Z'\Z) = 
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where Wz+(Z) and Wz-(Z) are the construction and destruction probabilities, respectively. From 
the detailed balance condition, they should satisfy the following relation. 



W z+ (Z)P cq ({R t } 7 {Aj}, {Sj}, Z) = W Z -(Z)P cq ({R t }, {A,}, {Sj}, Z + 1) (26) 

Although the condition (|2"B1) limits the form of dynamics, there are still many possible candidates 
for the reconstruction probabilities. In this work, we employ the following rather simple jump 
probabilities for the slip-spring reconstruction process. 



k T 1 f 3 \ 3/2 

W Z+ (Z) = -j-(8a M+lA + Ss z+1 ,n) Wo [^y 2 J exp \-^(R Sz+i - A z+1 f 
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Wz-(Z) = -j- ^2(Ss„i + S S] ,n) (28) 

It is straightforward to show that eqs (|2"T)) and (f2"5)l togather with eq (JTSJ) satisfy the condition 
(f2"6"l) . As we will show later, the transition probabilities (|27l) and (I28|) enable simple numerical 
implementations which are suitable to simulations on a GPU. 

The single chain slip-spring model shown in this subsection can reproduce reptation type dy- 
namics [T] qualitatively. We emphasize that each dynamics described in this subsection satisfies 
the detailed balance condition, and thus our single chain slip-spring model has well-defined ther- 
modynamic equilibrium state. 

Although we did not incorporate the effects such as the constraint release (CR) or the convective 
constraint release (CCR) into our model, it is not difficult to take these effects and improve the 
model. Interestingly, while the CR process is not explicitly considered, the relaxation process which 
is qualitatively similar to the CR exists in our model. (As shown in Appendix \K\ the dynamics 
of bead indices exhibit CR type relaxation mechanism.) Our model can reproduce rheological 
properties reasonably, as we will show later. 
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2.3 Linear Response Theory 

To obtain linear response functions such as the shear relaxation modulus around equilibrium, the 
linear response theory [41] is sometimes quite useful. The linear response theory states that if the 
dynamics satisfies the detailed balance condition, the response of the system to a small perturbation 
is simply expressed by correlation functions (the fluctuation-dissipation relation). Because our 
model is designed to satisfy the detailed balance condition, we can utilize the fluctuation-dissipation 
relation to calculate the response in our model. (Some slip-link models do not satisfy the detailed 
balance condition. For such models, the validity of the linear response theory is not guaranteed 
and it may not be useful.) In this subsection, we show the explicit expression for the relaxation 
modulus by using the standard linear response theory. 

The dynamics of our single chain slip-spring model is described by the Langevin equation 
and jump processes. In the absence of the velocity gradient tensor the system can relax to the 
equilibrium state, because the detailed balance condition is satisfied. Now we consider the velocity 
gradient tensor, n{t), as a time-dependent small perturbation and calculate the response of the 
time-dependent stress tensor <r(t) to n(t). 

The probability distribution function is useful to calculate the linear response. We define the 
time-dependent probability distribution function in the phase space as 



P({r t },{ai},{si},z;t) 
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The time evolution of the probability distribution function can be given as the following master 
equation. 

^P({rJ, {a,}, { Sj }, z; t) = [C R (t) + C A (t) + £ s + C z ] P({n}, {a,}, { Sj }, z; t) (30) 
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where Cr{€) and Ca (t) are the Fokker-Planck type operators, and Cs and Cz are the time evolution 
operators which come from jump processes for {Sj} and Z. Notice that and £a(£) depend 

on time t whereas Cs and Cz do not. This is because the dynamics of {Sj} and Z is not directly 
affected by the external flow. The explicit forms for Cn{t) and £a(£) are as follows. 
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We do not show the explicit forms for Cs and Cz here, because their explicit forms are not required 
in the following calculations. 

To calculate the linear response, we need to decompose the time evolution operator into the 
equilibrium and perturbation parts. Thus we define the following two time evolution operators. 
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By using these operators, the time evolution equation of the probability distribution function can 
be simply described as follows. 

^ = [C +Cx(t)]P (35) 

In the absence of the velocity gradient, the perturbation part of the time evolution operator dis- 
appears (C\(t) = 0) and the system relaxes to the equilibrium state. The equilibrium distribution 
function P cq is given as the Boltzmann form since the detailed balance condition is satisfied. 
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Using the equilibrium distribution function we write the equilibrium average for an operator B as 
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Now we can follow the standard procedure to calculate the linear response [HJEZI- ^ ne time- 
dependent stress tensor of a single chain is calculated by the following stress tensor operator. 
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Following the standard linear response theory, we have the following expression for the time- 
dependent stress tensor. (See Appendix iBl for detail.) 
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where <x oq = (<r) cq = — fc^Tl is the equilibrium value of the stress tensor. After straightforward 
calculations, finally we have the following expression for the time-dependent stress tensor. 



<r(t) = ^c q + 7^= / dt' \(&(t - t')&) cq + (&(t - t')&^)J : n(t') (40) 

where <x(t) is the stress tensor operator evolved by time t (the time-shifted stress tensor operator) 
and &( v > is the virtual stress tensor operator by slip-springs and : means the dyadic product. &^ 
is defined as 

2 — 1 

Eq (|4"Tj) is similar to eq but it represents the contribution of slip-springs. The relaxation 
modulus tensor G{t) (which is a fourth order tensor) corresponds to the response function of the 
stress to the velocity gradient. In this work we use the following definition for the relaxation 
modulus tensor. 

c [cr(t) - o-e q ] - f dt' G(t - t') : nit') (42) 

./ — OO 

From eqs fj4Q|) and (|42]l. we find that the relaxation modulus tensor is simply given as 



G{t) = rrww** + ^f(^ {v) u (43) 



It should be noted here that there are two contributions for the relaxation modulus tensor. One is 
the usual stress-stress autocorrelation function, and another is the stress- virtual stress correlation 
function which does not appear in usual many body systems. 

Here we note that eq (|43[) coincide with the formula previously proposed by Ramirez, Suku- 
maran and Likhtman 43 . Intuitively, this form can be understood as follows. The stress tensor 
operator <r is not conjugate to the perturbation n(i), but the sum of the stress and the virtual 
stress operators, <x -I- &( v \ is conjugate to the perturbation. Then, following the standard formula 
of the linear response theory, the response function is given as the time correlation function of 
<r and <r + ffW, which gives eq (j40|) . If we employ cr + <xW as the stress tensor of a chain, the 
relaxation modulus tensor is simply given as the autocorrelation function of & + &t v K However, 
such a definition violates the stress-optical rule, and it is physically unnatural. (We will examine 
the contribution of the virtual stress is quantitatively, later.) Therefore, in this work we use cr as 
the stress tensor operator. 



3 Numerical Scheme and Implementation 

3.1 Discretization Scheme 

To solve dynamics of our single chain slip-spring model numerically, we need to discretize dynamic 
equations and jump processes. In this subsection, we briefly show the discretization scheme. Since 
our purpose in this work is to develop a model which is suitable for calculations on a GPU, here 
we aim to make rather simple and stable schemes. 

Before considering discretization schemes, we make all the parameters dimensionless. We set b — 
1, fceT = 1, and £ = 1. (This is equivalent to make all the dimensional parameters dimensionless 
by characteristic scales b, ksT, and £.) The characteristic time scale of simulations also becomes 
unity (r = C,b 2 /k B T = 1). 

To make schemes simpler, we split dynamics from t to t + At (with At being the time step 
size) into several substeps. In this work, we use the following three substeps to evolve the system 
from t to t + At. Each substep is designed to satisfy the detailed balance condition for small At 
(At < 1). 
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1. Integrate dynamics equations for {Ri}, {Aj} (eqs ([TBI and (|2T]|), These are stochastic differ- 
ential equation and ordinary differential equation and thus we can employ several standard 
schemes. To minimize the computational cost, here we employ the explicit Euler method 
which is not accurate but the simplest. 



Ri(t + At) 
Aj(t + At) 
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dJ({R i },{A j },{S i },Z) 
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where Wi is a standard distribution random number vector which satisfies the following 
relation. 
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Wi can be generated easily by some random number generators such as the combination of 
the linear coagulation method and the Box-Mullcr transform. 

Move slip-spring bead indices {Sj} by transition probabilities (|22[) -(|24 |) . We use the following 
accumulated probabilities for finite time step At and uniform distribution random variables. 
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(48) 



(49) 



The minima are taken so that each transition probability does not exceed 1/2. Although the 
discretization scheme shown here is not accurate, it does not fail even for large At due to 
this trick. 



Reconstruct slip-springs on chain ends by the transition probability eq (|25|) . (|27|) . and (|28[) . 
First, new slip-springs are constructed at chain ends by the following accumulated probability. 
The construction is attempted K times repeatedly (with K being an integer parameter which 
can depend on Z). This parameter K is introduced to allow the construction attempt 
evaluated in parallel. This is because parallel attempts are much efficient than single attempt 
on a GPU. (See Section GOD 
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(50) 
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The minima are taken to prevent each transition probability from exceeding 1/2. The an- 
chored point position of a new slip-spring, Az+i, is sampled from the equilibrium distribu- 
tion. Since the equilibrium distribution of an anchoring point is Gaussian (eq (|15p ). it can 
be easily generated. 
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Second, each slip-spring is destructed by the following accumulated probability. 




= mm 




} 



(53) 



As before, the minimum is taken to avoid each transition probability to exceed 1. We reset 
indices of slip-springs after the constructions or destructions. This ensures that the index for 
slip-springs on a chain always runs from 1 to Z (J = 1, 2, . . . , Z). 

The discretization schemes shown here are not accurate compared with more advanced schemes. 
(For example, it is possible to improve the accuracy by employing more advanced and accurate 
discretization schemes such as the stochastic Runge-Kutta method [H].) In this work, we prefer 
inaccurate but simple schemes for an implementation on a GPU. Since discretization schemes are 
designed to be as simple as possible, it is not difficult to implement them for GPGPU calculations 
(in which we have several limitations due to the GPU architecture). Although the accuracy is not 
high, we can perform stable simulations with these schemes even for not small At. (That is, the 
simulations do not fail easily even for rather large At.) 

Before starting simulations we have to prepare well equilibrated samples. Since we know the 
equilibrium probability distribution, we can directly generate the equilibrium conformation of 
chains and slip-springs easily. All the simulations performed in this work start from the equilibrium 
state generated based on eq (|13l) . The sampling procedure is as follows. 

1. Generate polymer conformation {Ri} by sampling from the Gaussian distribution (eq (p])). 

2. Sample Z from the Poisson distribution (eq (|12J>). 

3. Generate Z slip-springs. Each segment index, Sj, is sampled from the uniform distribution 
(eq (HU)) and each anchoring point, Aj, is sampled from the Gaussian distribution (eq ([15])). 

To obtain statistical quantities form Langevin type simulations, we need to perform many 
simulations with the same parameter set and different random number series. For this purpose, we 
simulate M different polymer chains and calculate the statistical average of an arbitrary physical 
quantity B approximately as 



where the subscript k means that the variable is of the fc-th sample chain (for example, Rk,i is the 
i-th bead position of the fc-th chain) . 

3.2 Implementation on GPU 

As we mentioned, we use the CUDA programming model to implement the discretized single chain 
slip-spring model on a GPU. In the CUDA programming model, there are two memory spaces on a 
graphics card; the global memory and the shared memory. The shared memory is small (currently 
its size is just 16kB) but the access speed is fast (comparable to the access speed of the registers). 
The global memory is large but the access speed is rather slow. Thus it is required to reduce the 
access to the global memory to speed up the program. Fortunately, our single chain slip-spring 
model requires very small memories and thus most of the data can be stored in the registers or the 
shared memory. To make the CUDA program efficient, it is also required to achieve high parallelism 
because a GPU has many threads (typically about several hundreds or several thousands) which 
are executed simultaneously. We can implement highly parallelized program by allocating one bead 
and one slip-spring to one CUDA thread. Although this limits the maximum number of slip-springs 
(Z < N), in most cases it cause no problems (the probability that Z > N is practically negligible 
if No is not small). Here we note that, it is difficult to implement in such a way if we employ the 
slip-link type singe chain model [6H91ITT]. because nodes are dynamically reconstructed and thus 




(54) 
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the number of total nodes is not constant. This is one of the main reason why the slip-spring 
model is employed in this work. 

For the slip-spring reconstruction scheme on a GPU, we set the number of attempts to construct 
slip-springs as K = N — Z. This means that each thread which do not have a slip-spring attempts 
to construct a new one (and each attempt can be evaluated in parallel). For a CPU, we set K = 1, 
which means that the slip-spring construction is attempted only once at each step. Clearly this 
is suitable for a CPU, because we cannot evaluate many attempts in parallel. (Strictly speaking, 
this implementation will violate the detailed balance condition. But the violation is practically 
negligible.) 

Currently it is generally not efficient to use double precision floating-point numbers ( "double" 
type variables in CUDA), and thus single precision floating-point numbers (the "float" type vari- 
ables) should be used to accelerate simulations efficiently. Fortunately this limitation is not serious 
for our single chain slip-spring model. Because our model is based on the stochastic differential 
equations and jump processes, the accuracy is mainly determined by number of samples and the 
error is typically much larger than the error of single precision floating-point number operations 
(the machine epsilon is about efl oat 10~ 7 whereas the statistical error is roughly proportional to 
the inverse square root of sampling numbers). Several elementary mathematical functions can be 
calculated very efficiently on a GPU. In our implementation, functions such as sins, cos a;, exps, 
In x, or y/x are used. Although the fast calculations for these mathematical functions involve some 
errors, such errors are not serious in our simulations (the errors are typically comparable to one of 
single precision floating-point number operations). 

In parallelized programs, it is often needed to perform reduction operations. In our model, we 
have to calculate the total number of slip-springs on a chain or forces acting on beads caused by 
slip-springs. The reduction operations are one of the most time-consuming parts in our simulations. 
Although CUDA currently does not provide special functions for these reduction operations, we 
can use some techniques to improve the speed of the reduction operations [55]. 

Another time-consuming part is the calculation of the slip-spring force acting on a chain. To 
calculate the slip-spring force efficiently, we utilize the fixed-point real number technique [46] and 
the atomic operations |28j in CUDA. The fixed-point real number technique uses the integer type 
variable x as the real number y — x x efi X cd with £fi xo d being the resolution. The resolution efi xe d 
should be determined so that the truncation error is sufficiently small and the overflow does not 
occur. (Because the single precision float number has the machine epsilon £fl oa t ~ 10 -7 , efi xo d is 
not necessarily to be very small.) In this work we set efi xc d = 1/4096 sa 2.4 x 10 -4 , which has a 
sufficient resolution yet the overflow docs not occur even under fast shear rates. The results shown 
below are not sensitive to the value of efi XG dj as long as £fi xc d is not too small nor too large. 

4 Results 

4.1 Rheological Properties 

Before measuring the acceleration effect by a GPU, we calculate several rheological properties of 
our slip-spring model. To obtain reliable simulation results, in this subsection, all the simulations 
are performed on a CPU. (The discretization schemes on a CPU are almost the same as the schemes 
on a GPU.) We set the simulation parameters as follows; the entanglement bead number Nq — 4, 
slip-spring strength N s = 0.5, the slip-spring friction coefficient £ s = 0.1, and the time step size 
At = 0.01. 

We first calculate the linear viscoelasticity by using the linear response formula ([43)1. The 
storage and loss moduli, G'{uj) and G"(w), are then calculated by utilizing the following relations. 




(55) 



(56) 
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The integration in eqs (|55[) and (1561) are numerically calculated by using the trapezoidal rule. 
Figure [T] shows the linear viscoelasticity data calculated from equilibrium single chain slip-springs 
simulations (for N = 10, 20, 40, and 80). Notice that G'(lu) and G"(u) are normalized by p^ksT 
to be dimensionless forms (pa is the average bead density). As shown in Figure [TJ our model 
reproduces linear viscoelasticity of entangled polymers qualitatively well, especially considering 
the simplicity of the model. The plateau modulus is about s» O.lpofcsT, which is almost the 
same as the plateau modulus obtained by the original slip-spring model simulation |10j . This means 
that, the characteristic number of beads between slip-springs Nq in this model is not identical to 
the entanglement bead number calculated from the plateau modulus based on the rubber elasticity 
theory. Following the standard definition, we define N e via the following equation. 

_ p k B T 

iV 

From G$ « 0.1p o k B T, we have N e sa 10. This value is much larger than N (N e w 2.5N Q ). 
Therefore the plateau modulus or N e are not determined only by TVo, but they depends on various 
parameters in a rather complex way. Practically, it is reasonable to use both Nq and Go (or G^ ) 
as two independent fitting parameters [TU1I47] . 

Figure [5] shows the longest relaxation time calculated from the linear viscoelasticity data. The 
longest relaxation time is calculated via the following equation 

- = - lim lnG W0 (58) 

Td t->oo t 

For small N, we find that the longest relaxation time is proportional to (N — l) 2 , which agrees 
with the scaling of the Rouse relaxation time. (Here N — 1 is used instead of N, since N — 1 
corresponds to the number of bonds in a chain [48].) This indicates that for small N (from Figure 
[21 N < 10) a chain with slip-springs behave essentially as an unentalgled chain. For large N, the 
relaxation time depends on N — 1 as Td oc (N — l) 3 ' 48 , which is similar to experimental results, 
Td oc (N — I) 3 4 (larger than the prediction of the pure reptation theory pQ , T d oc ( N — l) 3 ). Thus 
we find that the longest relaxation time and the zero shear is also reasonably reproduced by our 
simulation model. 

Figure [3] shows the zero shear viscosity rjo- The zero shear viscosity is calculated from G(t) as 

POO 

r/o= / dtG(t) (59) 
Jo 

We find that the viscosity is proportional to N — 1 or (N — l) 3 - 40 for small or large N, respectively. 
The cross over bead number is roughly estimated to be N c ss 14.2 ss \AN e . This value is slightly 
smaller than the experimental value N c /Nq = 1.6 ~ 3.5 [49], but the discrepancy is not so large. 
Although we do not show simulation results for other parameter sets (for example, for different 
No, N s , or ( s ), as shown in Ref [TU1 the effects of these parameters are not large and we have 
qualitatively similar results. Thus in this work we limit ourselves only to the standard parameter 
set used in Ref HU1 

Next we calculate the viscosity growth 17(4,7) and the steady state viscosity 77(7). At t = 0, 
the system is in equilibrium. For t > we apply the constant shear flow as 

J 7 (a = x,P = y) 
[ (otherwise) 



and calculate 77(^,7) and 77(7) as follows. 



v(t) . )= {^m (61) 

7 

77(7) = lim r](t, 7) (62) 

t— >oo 
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We also calculate the dynamic viscosity, which is defined as follows, to check whether the Cox-Merz 
rule [50] holds or not. 



77(7) « rf(u,)| . (63) 



77* (w) = VG" 2 H+G" 2 (w) (64) 

Figure [4] shows the viscosity growth curves for ./V = 40 with various shear rates (7T0 = 0.0025, 
0.005, 0.01, 0.025, 0.05, and 0.1). The number of sample chains is M = 1024 for all shear rates. 
As shown in Figure 2] that viscosity growth curves can be qualitatively well reproduced by the 
single chain slip-spring model. The stress overshoot behavior and the shear thinning behavior are 
reasonably reproduced. 

Figure [5] shows the dynamic and steady state viscosities calculated for N — 10, 20, 40, and 80. 
For relatively small 7 and u>, we observe that the Cox-Merz rule (|63|) holds reasonably. However, we 
find that in high shear rate regions, the steady state viscosities deviate from the complex viscosity 
and the Cox-Merz rule does not hold. In such regions, the viscosities are nearly independent of 
shear rate, thus they are the second Newtonian viscosities. These second Newtonian viscosities 
can be understood as follows. In high shear rate regions, slip-springs as well as chains are strongly 
stretched and slip-springs are easily destructed. Thus there are only a few slip-springs on a chain. 
For example, for N = 80 ((Z) cq = 20), the steady state average number of slip-springs are (Z) = 
8.7,4.2, and 1.6 for 7 = 0.01,0.1, and 1, respectively. Then, a chain behaves essentially like an 
ideal, Rouse chain. The viscosity of a Rouse chain is independent of shear rate since the Rouse 
model is linear. Following this picture, we expect that the second Newtonian viscosity depends on 
N as 

v(i) a N ^ 

In Figure [5] we can observe that eq (|65[) approximately holds at the high shear rate region in 
Figure [5] Therefore we conclude that the second Newtonian like behavior in high shear rate region 
is an artifact of our model, rather than a physical property of an entangled chain. This means 
that our model should not be applied to very high shear rate regions where the Cox-Merz rule is 
violated. Results for the steady state viscosity may be utilized to validate simulation results under 
more complex situations, such as flows around obstacles. 

In the shear thinning region in Figure [5] the steady state viscosity obeys the power law 77(7) oc 
7~ Q . The exponent a is smaller than 1 and thus the steady state shear stress <J X y{i) oc •j 1 ~ a 
is a monotonically increasing function of the shear rate 7. In the simple Doi-Edwards type tube 
model PP, <J X y(i) is not a monotonically increasing function of 7, and such a non-monotonic 
relation can cause mechanical instability (shear-banding instability) [51] . It is known that the 
CCR mechanism can successfully remove this instability [52T53] , As we mentioned, the CR or CCR 
mechanisms are not explicitly taken into our model. Even in absence of the CCR mechanism, our 
model is free from the mechanical instability (at least in the investigated parameter range). 

Finally, we shortly investigate the contribution of the virtual stress tensor In Section |2~31 

we assumed that the stress tensor of the system is given by & defined by eq (|38[) . One may prefer 
to employ & + &( v > as the stress tensor of the system, which is conjugate to the velocity gradient 
tensor. Here we consider the shear relaxation modulus as an example. The relative contribution 
of the virtual stress to the shear relaxation modulus can be defined as 

A& V )(t) = ft^yhs + ^ xy )eq ^.gg^ 

{&xy (t^)&xy}eq "i~ (Pxy{p*)&xy )cq 

If AGM(t) is negligibly small, or if it is independent of t, the stress-optical rule approximately 
holds even if we employ a + a^ v ' as the stress tensor of the system. Figure |5] shows the relative 
contribution of the virtual stress for N = 40. As shown in Figure is roughly about 25% 

and this value is not negligibly small. However, it does not strongly depend on time t. This means 
that if we employ er+er^ as the stress tensor of the system, the stress-optical rule is approximately 
valid and rheological properties would be qualitatively not changed (quantitatively they would be 
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changed by about 25%). The results shown in this subsection are therefore qualitatively not 
sensitive to the definition of the stress tensor. 

Judging from obtained rheological data, we can conclude that our model reasonably reproduces 
rheological properties and can be used to study rheological properties or flow behaviors of entangled 
polymers. Although further improvement of the model will be possible, it is beyond the scope of 
the current work and left for a future work. 

4.2 Comparison of Rheological Properties with Original Slip-Spring Model 

In this subsection, we briefly compare rheological data calculated by our single chain slip-spring 
model and the Likthman's original slip-spring model. We compare several rheology data shown in 
Ref [TO] with the results of our simulations. 

To avoid numerical errors due to the fitting or the numerical integration, we compare the shear 
relaxation modulus instead of storage and loss moduli. Figure [T^a) shows the shear relaxation 
moduli calculated by our model and the original slip-spring model, for N = 8, 16, 32, 64 and 128. 
Other parameters (Nq,N s and £ s ) are the same. The data of the orignal model is taken from 
Fig. 4(a) of Ref ITO1 We can observe that the forms of G(t) calculated by our model and the 
original model are quite similar for long time region (t > IOto), while the longest relaxation times 
are quantitatively different. Our model gives longer relaxation time for all cases. We consider 
this is mainly due to the lack of the CR effect in our model. (There may be other reasons for 
this discrepancy, such as the effect of the short time scale dynamics. However we consider their 
contributions are not large compared with the CR effect.) 

Nonetheless, the relaxation behavior of our model is qualitatively similar to the original model 
with the CR effect. To see it clearly, we show the shear relaxation moduli shifted vertically and 
horizontally in Figure Efb). The data by our model (solid curves in Figure BJa)) are shifted (the 
data by the original model are not shifted). For relatively long time region (t > to), our data can 
be collapsed to the data obtained by the original model well. The horizontal and vertical shift 
factors, a and b, are of the order of unity (a « 1.9 and b m 0.8 for N = 128), and they slightly 
depend on N. This will be due to the difference between the dependeces of relaxation mechanisms 
to N. 

Figure [8] shows zero shear viscosities calculated by our model and the original model (with or 
without the CR effect). The data of the orignal model is taken from Fig. 5(a) of Ref [TO] We can 
observe that for large N, r]o by our model is close to the one by the original model without the CR. 
On the other hand, for small N, our model gives smaller rjg compared with the original model. We 
consider this is due to the difference of the dynamics of slip-springs. In our model, slip-springs can 
be attatched only on beads while in the original model, slip-springs can be attatched in between 
beads. Besides, our model allows slip-springs to pass through each other. The slip-spring dynamics 
of our model seems to give faster relaxation for small N, compared with the original model. 

From the above comparisons for G(t) and 770, we conclude that our model can reproduce rhe- 
ological properties of the original model qualitatively well, especially for well entangled polymers. 
The relaxation time or the zero shear viscosity are close to the data by the original model without 
the CR effect. This is natural since our model does not incorporate the CR effect. Nonetheless, 
the relaxation behavior of our model is almost the same as one of the original model in the long 
time region (t > 10r ). 

4.3 Acceleration by GPU 

In the previous subsection, we have shown that the single chain slip-spring model can reproduce 
rheological properties qualitatively. In this subsection we show the results of the acceleration 
by a GPU. To study the acceleration, we compare the calculation times on a CPU and on a 
GPU, by using the same parameter set. As an example, in this work we use Intel Core 2 Duo 
E8500 (3.16GHz, dual-core) for simulations on CPU, and NVIDIA Tesla C1060 (1.3GHz, 240 
CUDA cores) for simulations on GPU. Simulation programs are written in C and CUDA for CPU 
and GPU, respectively. They are compiled by using gec (version 4.3.0) and nvee (version 2.1) 
and executed on Linux (kernel 2.6.9, x86_64). The program for CPU is written in ANSI C and 
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no CPU-specific extensions (such as the SSE or SSE2 instruction sets [54]) are utilized. The 
numbers of threads used for the calculations are 1 (CPU) and 128 x 128 (GPU). For comparison, 
we performed simulations with the following parameters; the numbers of beads and chains N x M = 
16 x 4096,32 x 2048,64 x 1024, or 128 x 512 (the total number of beads is kept to be constant), 
the shear rate 7 = or 0.05. All the simulations are started from the equilibrium initial state (at 
time t = 0). The time step size is At = 0.01, and the simulations are stopped at time t — 100 (the 
total number of time steps is 10000). The results are summarized in Table [T] We can observe that 
the program for GPU is about 290 times faster than the program for CPU. The acceleration by a 
GPU is quite effective to accelerate our single chain slip-spring model. 

5 Discussion 

5.1 Model Properties 

We formulated a single chain version of the slip-spring model for entangled polymers. Our model 
is designed as the simplified model of the original one. One notable property of our model is that 
it fully satisfies the detailed balance condition. This means that the equilibrium probability dis- 
tribution rigorously becomes the Boltzmann distribution. This enables us to tune the equilibrium 
statistics of the model easily. For example, we can employ the statistics of a non-ideal chain (real 
chain) for our model, or we can introduce the interaction between slip-springs. Such modifications 
can be done essentially only by changing the expression for the grand potential |2]). 

Because it is reported that the statistics of entangled polymer chains somehow depend on 
models (such as primitive path extraction methods [551457] or dynamic equations [55]), it will be 
desirable for a model to be tunable for a specific target statistics. The equilibrium distribution 
function (|13|) has a rather simple structure and we can tune, for example, the statistics of the chain 
or the statistics slip-springs easily. We expect that our model can be used to investigate rheological 
behaviors for various chain statistics models numerically. 

The detailed balance condition also becomes important when we derive the linear response 
formula ([43]) . Although the response formula ([43|) itself is already proposed by Ramirez, Sukumaran 
and Likhtman [43], they did not give the derivation based on the master equation. We gave the 
rigorous derivation based on the master equation, which corresponds to the Langevin and jump 
dynamics actually used in the simulations. Our result justifies the use of the response formula (|43|) 
to calculate the relaxation modulus tensor. 

Although dynamics of slip-springs is modeled as a simple jump dynamics in our model, it can 
reproduce linear and nonlinear rheological behaviors qualitatively. The linear rheological proper- 
ties are similar to the original slip-spring model. This implies that our model captures essential 
nature of the original slip-spring model. We also performed simulations for nonlinear rheological 
properties, and reproduced the viscosity growth and the Cox-Merz rule. Thus we consider that 
our single chain slip-spring model can be used as long as we want to calculate simple rheological 
properties. 

In our model, the rheological properties can be reproduced well while there is no CR effect. As 
we already pointed, even without the CR effect, there is a CR like relaxation mechanism in our 
model (as shown in Appendix [AJ. This relaxation mechanism is caused by the model property that 
slip-springs can pass through (or exchange) each other. Our result implies that in some situations, 
this "constraint exchange" mechanism can be employed instead of the CR mechanism. From the 
numerical point of view, if we allow slip-spring to pass through each other, the implementation 
becomes much easier (this is because the time evolution of each slip-springs can be evaluated in 
parallel). It seems to not be difficult to make slip-springs exchangeable in other slip-link type 
models. The exchangeable slip-links will improve numerical accuracy efficiently. 

5.2 Acceleration by GPU 

We observed that the acceleration by a GPU can improve the simulation speed drastically. The 
program for GPU is about 290 times faster than the program for CPU, which seems to be quite 
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efficient and promising. However, it is fair to mention about the possible acceleration by some CPU- 
specific extensional instructions. Several CPU-specific special extensional instructions can improve 
the performance largely. For example, the SSE instruction handles 4 single precision floating-point 
operations in parallel [54. The use of the SSE and/or SSE2 will improve the performance roughly 
about 10 times (for single precision floating-point number operations). Besides, currently the 
program for CPU is not parallelized. Because we can achieve very high parallelism for a single 
chain type model, the performance can be further improved by the factor 2 (for a dual-core CPU) . 
This means, even if we tune the program for CPU extremely and parallelize it, the GPU program 
is still about 15 times faster than the CPU program. (We also note that we can use multiple CPUs 
in parallel, and it can also improve the performance.) 

Although it is possible to improve the program for CPU, typically programs with such special 
instructions become quite complicated. The portability of the program is also decreased if we use 
CPU-specific instructions explicitly. As a result, the programming cost for CPU becomes much 
larger than one for GPU. Judging from the acceleration effects and the programming costs, we 
can conclude that the single chain slip-spring model simulations on a GPU are very efficient and 
promising for practical purposes. 

Since simulations on a GPU enable us to calculate rheological properties (such as stress tensor) 
very efficiently, in principle we can perform CONNFFESSIT or particle type multiscale simula- 
tions 13 15 with reasonable calculation costs, by combining our single chain slip-spring model and 
macroscale fluid models. When we perform macroscopic fluid simulations in which many meso- 
scopic rheological simulations are embedded (typically several thousand mesoscopic simulators are 
embedded in a single fluid element), the mesoscale simulations are the most time-consuming part. 
If we perform such multiscale simulations only on CPUs, the required calculation time is still con- 
siderable even if we parallelize the mesoscopic simulator. Our simulation model and use of a GPU 
can decrease the mesoscopic calculation cost drastically. We expect that the total simulation time 
of multiscale simulations can be also reduced drastically. Besides, the mesoscopic rheological sim- 
ulations can be further accelearated by using multiple GPUs (because our simulations are already 
highly parallelized). Cooperating our model with macroscale fluid models will be future works. 

To study rheological properties of complicated systems, such as the rheology of branched poly- 
mers or polymer blends, we will need to refine our model. We will be also required to take into 
account of the CR or the CCR, for precise calculations under fast shear rates. For more complex 
architectures, such as star polymers or comb polymers, the generalization of the model will be re- 
quired. We consider that generalization itself is not so difficult, but it may be difficult to implement 
it for a GPU because there are several limitations for a GPU. To perform simulations efficiently 
on a GPU, we will need to design a generalized model so that it is suitable for calculations on a 
GPU. 

6 Conclusion 

In this work we proposed a single chain slip-spring model, which is based on the Likhtman's slip- 
spring model [10]. The model is designed to be suitable for simulations on a GPU. Besides, the 
model is expressed by using the free energy and satisfies the detailed balance condition, which 
ensures that the system relaxes to the thermal equilibrium state. We calculated several static 
properties (equilibrium distirbution functions) analytically. We also calculated the linear response 
of the system to strain deformation, and obtained the Green-Kubo type formula for the relaxation 
modulus which is in agreement with the one previously proposed by Ramirez, Sukumaran and 
Likhtman [4"31 . 

We calculated several rheological properties such as the linear viscoelasticity or the viscosity 
growth, and shown that our model can reproduce them reasonably. To accelerate the simulations, 
we performed simulations on a GPU as well as simulations on a CPU. By comparing the simulation 
times, we found that the use of a GPU can accelerate a simulation approximately 290 times faster. 
The modification of our model or the application to actual multiscale simulations will be future 
works. 



17 



Acknowledgment 



This work is supported by JST-CREST. The author thanks Mr. Ryuji Sakamaki for informing the 
author about the fixed-point real number technique and Ref 221 He also thanks to Prof. Yuichi 
Masubuchi for various helpful comments. 

A Constraint Release Type Relaxation Mechanism 

In many models for entangled polymers, the constraint release (CR) effect is considered to be 
an important effect (especially for branched polymers). Several methods have been developed to 
take the CR events into account. Rubinstein and Colby [59 modelled the CR events as hopping 
motions of tube segments in a self-consistent way. Based on this idea, Schieber and coworkers 
[6,9,11] introduced (relatively) slow diffusion type Brownian motion of slip- links as the CR events. 
Masubuchi and coworkers [7] directly modelled the CR events as reconstruction of slip-links between 
two chains. Doi and Takimoto [SJ, and Likhtman [TU] modelled the CR events as reconstruction 
events in a similar way (in their models, slip- links (slip-springs) are virtually paired). In our single 
chain slip-spring model, the CR process is not explicitly considered. However, as we discuss in this 
section, the CR type relaxation process exists (implicitly) in our model. 

One peculiar property of our model is that slip-springs can pass through each other, unlike the 
original slip-spring model. Most of lip-link based models do not allow slip-links to pass through 
each other. Similarly, it is usually not allowed in most of tube models to exchange the neighboring 
entanglement points. Then, we can expect that the passing-through events of slip-springs will 
result in a sort of relaxation process. 

We consider a passing-through event of two neighboring slip-springs. Here we label the slip- 
spring indices j in the following order (ascending in Sj) to compare our model with conventional 
models. 

Si < S 2 < S 3 < ■ ■ ■ < S z (67) 

This condition is (implicitly) assumed in many slip-link based models. If the j-th and (j + l)-th 
slip-springs are exchanged at time t, then we should exchange the slip-spring indices and anchoring 
points to satisfy the condition (|67|) . 

Sj (t + 0) = S j+i (i), S J+1 (t + 0) = Sj (t) (68) 
Aj(t + 0) = A J+ i(t), A j+1 (t + 0) = Aj(t) (69) 

This dynamics of two monomer indices, Sj and Sj+i, can be interpreted as the collision and 
reflection like dynamics. The dynamics of two anchoring points, Aj and Aj + \, can be interpreted 
as sudden jumps in space. We consider that such jump events are similar to the CR picture 
considered by Rubinstein and Colby [59]. Thus, we can interpret the exchange events as the CR 
like motions of anchoring points. Then the exchange events effectively give the CR like stress 
relaxation process. 

However, we should notice that the exchange events do not exactly correspond to the conven- 
tional CR events. For example, our CR like events are non-Markovian while the CR events are 
usually modelled as Markovian. (This is because after one exchange event, the same slip-spring 
pair can pass through each other again. Such a process results in the memory effect.) Thus the 
effect of the exchange events to the stress relaxation is expected not to be strong at long time 
scale. To fully take account of the conventional CR events, we will need to model the CR process 
in our model as another jump process. 

B Detailed Calculations in Linear Response Theory 

In this appendix, we show detailed calculations in the derivation of the linear response of the stress 
to the velocity gradient tensor. Although the calculations themselves are rather straightforward, 
the final result is not so intuitive. We show detailed calculations to avoid confusions. We note that, 
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following the same procedure, one can derive other linear response functions such as the dielectric 
response function. Mainly we follow the standard derivation of the linear response theory for the 
Fokker-Planck equation [42] . 

We may start from eq (|35| , the time evolution equation for the probability distribution function. 
If we decompose the probability distribution function into the equilibrium and perturbation parts, 

P({n}, {a,}, { Sj }, z- t) = P eq ({n}, {a,-}, { Sj }, z) + P x ({n}, {a,}, { Sj }, z; t) (70) 

the time evolution equation (|35l) can be approximately expressed as follows, by taking only the 
linear terms in the perturbation expansion. 



C P 1 +C 1 (t)P f 



cq 



(71) 



where we used that the equilibrium part of the time evolution operator, Co, and the equilibrium 
distribution function P oq satisfy the following equation. 



dP 



cq 



dt 



— CoPen — 



By integrating eq (ITlT) we have 

P 1 ({r i },{a j },{s j },z;t)= I dt'e (t - t '^C 1 (t')P cq 
The average time-dependent stress tensor is then calculated to be 

tr(t)= / d { r i} d { a ]}c rP (.{ r i}A a ]}A s ]}i z ^) 

= fd{ri}d{ aj }& P cq + f ^' e (*- t ') £o £ 1 (t') J Pc< 

r i J L J —oo 

= <r cq + f dt' Y [diriWaA&eW^&fflP^ 
where <x eq = (<r) eq - Eq (f74"f can be modified further by utilizing the following equation. 
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where we defined the virtual stress tensor operator tr^as 



dj_ 
da* 



a,j — k B Tl 



K(t)P 



cq 



(75) 



E3k B T 
— — (r Sj - aj )(r Sj - a : ) - zk B T\ 



3=1 



(76) 



Finally we have the following expression for the time-dependent stress tensor, and thus we have 
eq (HOD. 



1 

er(t) = cr cq + j-j; 

1 



dt' Y J d{ri}d{aj} &e^- t,)Co \[& + M v) ] : n(t')P ( 

z,{s,} 



cq 



oq k B T 



dt' / d{ri}d{ aj } 



(t-*')4, 



[& + &^]P eq : K (t') (77) 



^eq + ^ df <<x(* - [«T + ) cq : «(*') 
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Here C is the adjoint operator of Cq , which is defined via the following relation. 

J d{r l }d{a ] }BC P ccl = £ J d{r i }d{o J } (4B)P eq (78) 

z,{ s i} z,{si} 

where B is an arbitrary operator. We also defined the time-shifted operator of B as follows, from 
the fact that C works as the equilibrium time evolution operator. 

B(t) = e tc °B (79) 

where B is again an arbitrary operator. 

We note that the detailed balance condition is essential in the preceding derivation of the linear 
response formula. If the model does not satisfy the detailed balance, generally the simple Green- 
Kubo type formulae such as eq (|43| do not hold. Although we can calculate the linear responses 
even if the detailed balance condition is not satisfied, generally the resulting expressions do not 
reduce to the Green-Kubo form. 

References 

[1] Doi M, Edwards SF, "The Theory of Polymer Dynamics", (1986), Oxford University Press, 
Oxford. 

[2] Ianniruberto G, Marrucci G, J Rheol, 45, 1305 (2001). 

[3] Marrucci G, Ianniruberto G, Phil Trans R. Soc Lond A, 361, 677 (2003). 

[4] Likhtman AE, Graham RS, J Non-Newtonian Fluid Mech, 114, 1 (2003). 

[5] Kremer K, Grest GS, J Chem Phys, 92, 5057 (1990). 

[6] Hua CC, Schieber JD, J Chem Phys, 109, 10018 (1998). 

[7] Masubuchi Y, Takimoto J, Koyama K, Ianniruberto G, Greco F, Marrucci G, J Chem Phys, 
115, 4387 (2001). 

[8] Doi M, Takimoto J, Phil Trans R Soc Lond A, 361, 641 (2003). 

[9] Schieber JD, Neergaard J, Gupta S, J Rheol, 47, 213 (2003). 
[10] Likhtman AE, Macromolecules, 38, 6128 (2005). 
[11] Nair DM, Schieber JD, Macromolecules, 39, 3386 (2006). 
[12] Kindt P, Briels WJ, J Chem Phys, 127, 124901 (2007). 
[13] Laso M, Ottinger HC, J Non-Newtonian Fluid Mech, 47, 1 (1993). 

[14] Halin P, Lielens G, Keunings R, Legat V, J Non-Newtonian Fluid Mech, 79, 387 (1988). 
[15] Murashima T, Taniguchi T, J Polym Sci B: Polym Phys, 48, 886 (2010). 
[16] Yasuda S, Yamamoto R, Europhys Lett, 86, 18002 (2009). 
[17] Yasuda S, Yamamoto R, Phys Rev E, 81, 036308 (2010). 
[18] http:/ /www. peta.co.jp/. 

[19] Susukita R, Ebisuzaki T, Elmegreen BG, Furusawa H, Kato K, Kawai A, Kobayashi Y, 
Koishi T, McNiven GD, Narumi T, Yasuoka K, Comp Phys Comm, 155, 115 (2003). 



20 



[20] Narumi T, Ohno Y, Futatsugi N, Okimoto N, Suenaga A, Yanai R, Taiji M, in 11 Proceedings of 
NIC Workshop 2006, From Computational Biophysics to Systems Biology", NIC Series, 34, 
29 (2006). 

[21] http://www.clearspeed.com/. 

[22] "ClearSpeed Software Development Kit Reference Manual Version 3.0" , (2008), ClearSpeed. 
[23] http://cell.scei.co.jp/. 

[24] http: / / www.ibm.com / dcvclopcrworks / power / cell / . 

[25] "Cell Broadband Engine Programming Handbook Version 1.11" , (2008), IBM. 
[26] http://www.nvidia.com/object/cuda_home_new.html. 

[27] Lindholm E, Nickolls J, Oberman S, Montrym J, IEEE Micro, 28, 39 (2008). 

[28] "NVIDIA CUD A Programming Guide Version 2.1", (2008), NVIDIA. 

[29] http : / /ati . amd . com/ technology / streamcomput ing / . 

[30] "ATI Stream Computing User Guide rev 1.4.0a", (2009), AMD. 

[31] Owens JD, Lucbkc D, Govindaraju N, Harris M, Kriigcr J, Lcfohn AE, Purcell TJ, Comp 
Graph Forum, 26, 80 (2007). 

[32] Hamada T, Iitaka T, arXiv:astro-ph/0703100. 

[33] Anderson JA, Lorenz CD, Travesset A, J Comp Phys, 227, 5342 (2008). 

[34] Cohen J, Garland M, Comput Sci Eng, 11, 58 (2009). 

[35] Preis T, Virnau P, Paul W, Schneider JJ, J Comp Phys, 228, 4468 (2009). 

[36] Januszewski M, Kostur M, Comp Phys Comm, 181, 183 (2010). 

[37] Schieber JD, J Chem Phys, 118, 5162 (2003). 

[38] Masubuchi Y, Iannirubcrto G, Greco F, Marrucci G, J Chem Phys, 119, 6925 (2003). 

[39] van Kampen NG, "Stochastic Processes in Physics and Chemistry" , 3rd ed, (2007), Elsevier, 
Amsterdam. 

[40] Glauber RJ, J Math Phys, 4, 294 (1963). 

[41] Evans DJ, Morris GP, "Statistical Mechanics of Nonequilibrium Liquids", 2nd ed, (2008), 
Cambridge University Press, Cambridge. 

[42] Risken H, "The Fokker-Planck Equation" , 2nd ed. (1989), Springer, Berlin. 

[43] Ramirez J, Sukumaran SK, Likhtman AE, J Chem Phys, 126, 244904 (2007). 

[44] Honcycutt RL, Phys Rev A, 45, 600 (1992). 

[45] Harris M, "Optimizing Parallel Reduction in CUDA" , in NVIDIA CUDA SDK. 

[46] Narumi T, Sakamaki R, Kameoka S, Yasuoka K, in "Proceedings of Ninth International 
Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT 
2008)", 143 (2008). 

[47] Masubuchi Y, Ianniruberto G, Greco F, Marrucci G, J N on- Newtonian Fluid Mech, 149, 87 
(2008). 



21 



[48] Strobl G, 11 The Physics of Polymers", 2nd ed., (1997), Springer, Berlin. 

[49] Fetters LJ, Lohse DJ, Colby RH, in "Physical Properties of Polymers Handbook", Mark JE, 
cd, 2nd ed., 445 (2007), Springer, New York, 

[50] Graessley WW, "Polymeric Liquids and Networks: Dynamics and Rheology" , (2008), Taylor 
and Francis, New York. 

[51] Cates ME, McLcish TCB, Marrucci G, Europhys. Lett., 21, 451 (1993). 

[52] Mead DW, Larson RG, Doi M, Macromolecules, 31, 7895 (1998). 

[53] Graham RS, Likhtman AE, McLeish TCB, Milner ST, J. Rheoi, 47, 1171 (2003). 

[54] "Intel 64 and IA-32 Architectures Software Developer's Manual Volume 1: Basic Architec- 
ture", (2009), Intel. 

[55] Everaers R, Sukumaran SK, Grest GS, Svaneborg C, Sivasubramanian A, Kremer K, Science, 
303, 823 (2004). 

[56] Kroger M, Comp Phys Comm, 168, 209 (2005). 

[57] Tzoumanekas C, Theodorou DN, Macromolecules, 39, 4592 (2006). 

[58] Masubuchi Y, Uneyama T, Watanabe H, Ianniruberto G, Greco F, Marrucci G, J Chem Phys, 
132, 134902 (2010). 

[59] Rubinstein M, Colby RH, J Chem Phys, 89, 5291 (1988). 



22 



Figure and Table Captions 



Figure [TJ Storage and loss moduli calculated by the single chain slip-spring model. Circles and 
crosses indicate G'(ui) and G"(uj) calculated by simulations. 

Figure [5] The longest relaxation time calculated from the linear viscoelasticity data. Broken lines 
show slopes 2, 3 and 3.48, which correspond to the Rouse type relaxation, the pure reptation type 
relaxation, and the exponent obtained by the fitting, respectively. 

Figured] Zero shear viscosity calculated from the linear viscoelasticity data. Broken lines show 
slopes 1 and 3.4. The critical bead number is estimated to be N c — 14.2. (The arrow indicates the 
critical bead number.) 

Figure @] Viscosity growth curves with various shear rates for TV = 40. The shear rates are 
7T = 0.0025, 0.005, 0.01, 0.025, 0.05, and 0.1. The broken line shows the zero shear viscosity 
calculated from the linear viscoelasticity data. 

Figure [5] Dynamic and steady viscosities, t]*(uj) and 77(7). Circles indicate T)*(u>) and the crosses 
(with curves) indicate the 77(7). 

Figure [BJ Relative contribution of the virtual stress to the shear relaxation modulus. 

Figure [TJ (a) Comparison of shear relaxation moduli calculated by original and our slip-spring 
models. Solid curves are calculated by our model. Dotted curves are the data of the original 
slip-spring model with or without the CR effect (taken from Fig. 4(a) of Ref 10). N = 8, 16, 32, 64 
and 128, from left to right, (b) The same data with (a) but data calculated by our model are 
shifted horizontally and vertically. 

Figure [HJ Comparison of zero shear viscosities calculated by original and our slip-spring models. 
Circles are calculated by our model (and the same as the data shown in Figure [3]) . Crosses and 
triangles are the data of the original slip-spring model with and without CR (taken from Fig. 5(a) 
of Ref ED). 

Table El Calculation times with various simulation parameters on a CPU and on a GPU. N,M,j 
are the number of beads per chain (polymerization index), the total number of chains, and the 
shear rate, tcpv and topv are the calculation times on a CPU (Intel Core 2 Duo E8500, single 
thread) and on a GPU (NVIDIA Tesla C1060, 128 x 128 threads), respectively. Simulations are 
performed from t — (in equilibrium) to t = 100 with the time step size At = 0.01. The speed-up 
factor is defined as the ratio of calculation times, tjcpuAgpu- 
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Table 



N x M 
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£cpu[s] 


*GPU[S] 




speed-up 


16 x 4096 





2.66 x 10 2 
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297 


16 x 4096 


0.05 


2.65 x 10 2 
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